ABSTRACT

In this paper, we present the asymptotic distribution of M-estimators for parameters in unstable
AR(p) processes. The innovations are assumed to be in the domain of attraction of a
symmetric stable law with index $0 < \alpha < 2$. In particular, in the case of repeated unit roots or
conjugate complex unit roots, M-estimators have a higher asymptotic rate of convergence than
the least squares estimators, and the asymptotic results can be written as Itô stochastic
integrals.


AMS classification: 62M10, 60G52, 62F40. 

Keywords: Autoregressive model, Unit root, Stable process, Non-stationary, Bootstrapping.

1 Introduction 

Consider the autoregressive process of order p (AR(p))

$$\phi(B) X_t = \epsilon_t, \tag{1.1}$$

where B is the backward shift operator and

$$\phi(z) = 1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p. \tag{1.2}$$

The errors $\{\epsilon_t\}$ in (1.1) form a sequence of independent and identically distributed (i.i.d.) random
variables in the domain of attraction of a symmetric stable law with index $0 < \alpha < 2$. The model
(1.1) is referred to as a non-stationary autoregressive time series if the characteristic polynomial $\phi(\cdot)$
has at least one root on the unit circle.
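
As a small illustration of this definition, the sketch below evaluates the characteristic polynomial at a point $e^{i\theta}$ on the unit circle and reports whether it vanishes there. The function names and the tolerance are hypothetical choices made for this example only; they do not come from the paper.

```python
import cmath
import math


def char_poly(z, phi):
    """Evaluate phi(z) = 1 - phi_1 z - ... - phi_p z^p for phi = [phi_1, ..., phi_p]."""
    return 1 - sum(coef * z ** (k + 1) for k, coef in enumerate(phi))


def has_unit_root_at(theta, phi, tol=1e-9):
    """True if exp(i*theta) is numerically a root of the characteristic polynomial."""
    return abs(char_poly(cmath.exp(1j * theta), phi)) < tol


theta = math.pi / 4
phi_complex = [2 * math.cos(theta), -1.0]  # phi(z) = 1 - 2cos(theta) z + z^2
phi_double = [2.0, -1.0]                   # phi(z) = (1 - z)^2
phi_stationary = [0.5]                     # phi(z) = 1 - 0.5 z, root at z = 2

print(has_unit_root_at(theta, phi_complex))   # complex conjugate unit roots
print(has_unit_root_at(0.0, phi_double))      # repeated root at z = 1
print(has_unit_root_at(0.0, phi_stationary))  # no root at z = 1
```

The check only tests one candidate angle at a time; locating all roots would require a polynomial root finder, which is outside the scope of this sketch.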

It is well known that unit root tests are an important tool for deciding whether a time
series is stationary or non-stationary. When economic variables are non-stationary, estimates may
generate a spurious model unless the variables are cointegrated. A unit root test can also be used to
test for cointegration of two processes. The analysis of unit-root processes and cointegrated time series
has arguably been one of the most important and controversial topics in econometrics in the last few decades.
The case where innovations are in the domain of attraction of the Gaussian distribution has received
considerable attention in the cointegration literature; see, for example, Engle and Granger (1987) and
Park and Phillips (1988). However, many empirical studies show that heavy-tailed and asymmetrically
distributed samples are frequently observed in economic and, especially, financial time series.
In these cases, Gaussian models are not applicable. Paulauskas and Rachev (1998) develop the
asymptotic theory for econometric cointegration processes under the assumption of infinite variance
innovations with different tail indices.

The problem of conducting asymptotic inference for time series with unit roots has been a 
challenging topic of interest for some time. In cases where errors (innovations) have finite variance, 
Dickey and Fuller (1979), and Phillips and Perron (1988) provide the asymptotic theory for the 
least squares (LS) estimators in an AR(1) process with one unit root. Chan and Wei (1988) study 
the large sample theory for a non-stationary autoregressive AR(p) model when the innovations form 
a sequence of martingale differences with respect to an increasing sequence of $\sigma$-fields.

With infinite variance innovations, Chan and Tran (1989) consider the Dickey-Fuller test when
the errors are in the domain of attraction of a stable law. Phillips (1990) extends the results of
Chan and Tran (1989) to find the limit theory of the parameters in an AR(1) process with weakly
dependent errors in the domain of attraction of a stable law. Since both the Dickey-Fuller and
Phillips-Perron statistics are based on LS estimation, they do not take advantage of the heavy tails
of the innovations and can exhibit rather poor power performance. Thus, it is important to consider
estimation and inference procedures that are robust to departures from the finite variance condition.
One way to achieve robustness is to use M-estimation. With an appropriate choice
of loss function, M-estimates have a number of desirable properties when the errors are heavy
tailed. Knight (1989) considers the asymptotic behavior of the LS estimates and M-estimates for
the random walk model. The results establish that self-normalized M-estimates are asymptotically
normal and that their rate of convergence is higher than that of the LS estimates. Davis, Knight, and Liu (1992)
mention that M-estimates are more appropriate when the distribution of the innovations is heavy-tailed.
This follows from the fact that M-estimates give less weight to outliers. Samarakoon and
Knight (2009) develop a class of unit root tests based on M-estimators in an AR process with a unit
root driven by infinite variance innovations.

Chan and Zhang (2012) obtain the limiting distribution for the LS estimates of the parameters
of unstable AR(p) processes with i.i.d. innovations in the domain of attraction of a stable law. They
show that the limiting distribution of the LS estimate is a function of integrated stable processes.
However, for model (1.1) with unit roots when $\{\epsilon_t\}$ is a sequence of random variables with infinite
variance, a complete theory for a more efficient estimation technique is still missing in the literature.

In this paper, we consider an important class of unstable autoregressive time series models with
many practical implications. One example is the class of seasonal models, where $\phi(z)$
may have several real and complex conjugate roots on the unit circle. We derive the asymptotic
distribution of M-estimators for the parameters in an unstable AR(p) process, where the innovations
are in the domain of attraction of a stable law with index $0 < \alpha < 2$. Our results show that,
similar to the previous cases, M-estimators have a higher asymptotic rate of convergence than LS
estimators. This paper is organized as follows. Section 2 provides some necessary preliminary
concepts. In Section 3, the limiting distribution of M-estimates in an AR(p) model is presented, and
Section 4 consists of our simulation study. Due to the complexity of the limiting distributions, a brief
discussion and a bootstrap simulation scheme are presented in Section 5. We summarize our results
in Section 6 and, finally, the proof of the main theorem is outlined in Appendix A.

2 Preliminaries 

Consider the AR(p) model in (1.1) with characteristic polynomial in (1.2). The classical M-estimator
$\hat{\Phi} = (\hat\phi_1, \hat\phi_2, \ldots, \hat\phi_p)^T$ of $\Phi = (\phi_1, \phi_2, \ldots, \phi_p)^T$ minimizes

$$\sum_{t=p+1}^{n} \rho\left(X_t - \beta_1 X_{t-1} - \cdots - \beta_p X_{t-p}\right)$$

with respect to $(\beta_1, \ldots, \beta_p)$, where $\rho$ is an almost everywhere differentiable convex function. This
guarantees the uniqueness of the solution. For more details see Davis et al. (1992). Usually, $\rho(x)$
grows at a slower rate than $x^2$ as $|x|$ gets large. An example for $\rho(\cdot)$ is the Huber loss function
given by

$$\rho_h(x) = \tfrac{1}{2} x^2\, I(|x| \le c) + \left(c|x| - \tfrac{1}{2} c^2\right) I(|x| > c) \tag{2.1}$$

for a known constant c, where $I(\cdot)$ denotes the indicator function. Throughout this paper we impose
the following assumptions on the function $\rho(\cdot)$.
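
To make the loss function concrete, here is a minimal Python sketch of the Huber loss together with its first and second derivatives, assuming the standard form of the Huber loss (quadratic for $|x| \le c$, linear beyond). The function names are illustrative and not taken from the paper.

```python
def huber_rho(x, c):
    """Huber loss: quadratic for |x| <= c, linear with slope c beyond that."""
    if abs(x) <= c:
        return 0.5 * x * x
    return c * abs(x) - 0.5 * c * c


def huber_psi(x, c):
    """Score function psi = rho': the argument clipped to [-c, c]."""
    return max(-c, min(c, x))


def huber_psi_prime(x, c):
    """Derivative of psi where it exists (everywhere except |x| = c)."""
    return 1.0 if abs(x) < c else 0.0


for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, huber_rho(x, 5.0), huber_psi(x, 5.0), huber_psi_prime(x, 5.0))
```

Because the score is bounded by c, a single very large residual contributes at most c to the estimating equation, which is the sense in which the estimator gives less weight to outliers.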

Assumption 1. (A1) Let $\rho$ be a convex and twice differentiable function, and take $\psi = \rho'$.

Assumption 2. (A2) $E(\psi(\epsilon_1)) = 0$ and $E(\psi^2(\epsilon_1)) < \infty$.

Assumption 3. (A3) $0 < |E(\psi'(\epsilon_1))| < \infty$ and $\psi'$ satisfies the Lipschitz continuity condition;
i.e., there exists a real constant $k \ge 0$ such that for all x and y,

$$|\psi'(x) - \psi'(y)| \le k|x - y|.$$

Note that for Assumptions A1-A3, sometimes $\rho''$ does not exist everywhere. In this case, although
$\rho'$ is not differentiable at a countable number of points, the results will usually hold with
some additional complexity in the proofs. Moreover, we assume that the innovations $\{\epsilon_t\}$ satisfy:

Assumption 4. (A4) The innovations $\{\epsilon_t\}$ are i.i.d. random variables in the domain of attraction
of a stable law with index $0 < \alpha < 2$. Note that for $0 < \alpha < 2$, the innovations have regularly
varying tail probabilities as specified by

$$P(|\epsilon_1| > x) = x^{-\alpha} L(x)$$

for some slowly varying function L at $\infty$ with $\alpha > 0$ and, as $x \to \infty$,

$$\frac{P(\epsilon_1 > x)}{P(|\epsilon_1| > x)} \to p, \qquad \frac{P(\epsilon_1 < -x)}{P(|\epsilon_1| > x)} \to q, \qquad 0 \le p \le 1, \quad q = 1 - p.$$

Symmetry is a common assumption for innovations. For $0 < \alpha < 1$, symmetry is not required.
However, for $1 < \alpha < 2$ we only need $E(\epsilon_1) = 0$. For the sake of simplicity we assume symmetry of the
innovations; i.e., p = q = 1/2. Assumption A4 implies that:

$$S_n(t) = a_n^{-1} \sum_{k=1}^{[nt]} \epsilon_k \xrightarrow{d} S(t) \quad \text{in } D[0,1], \tag{2.2}$$

where $\xrightarrow{d}$ denotes convergence in distribution with respect to the Skorohod topology and [x] stands
for the integer part of x. The Skorohod space of càdlàg functions defined on [0,1], equipped with
the Skorohod topology, is denoted by D = D[0,1]. Here, $\{a_n\}$ is a sequence of positive constants
such that

$$a_n = \inf\{x : P[|\epsilon_1| > x] \le n^{-1}\}. \tag{2.3}$$

Moreover, $S(\cdot)$ is a stable process and from Resnick and Greenwood (1979) (see also Knight
(1989)), the representation for S can be shown to be

$$S(t) = \begin{cases} \sum_{k=1}^{\infty} \delta_k \Gamma_k^{-1/\alpha} I(U_k \le t) & \text{if } 0 < \alpha < 2, \\ \text{standard Brownian motion} & \text{if } \alpha = 2. \end{cases} \tag{2.4}$$

Here and throughout this paper, $\{U_k\}$ is a sequence of i.i.d. U[0,1] random variables and $\{\delta_k\}$ is a
sequence of i.i.d. random variables such that $P(\delta_k = 1) = p$, $P(\delta_k = -1) = q$, and p + q = 1. Also,
$\Gamma_1, \Gamma_2, \ldots$ are the arrival times of a Poisson process with Lebesgue mean measure.
Note that $\{U_k\}$, $\{\Gamma_k\}$, and $\{\delta_k\}$ are mutually independent and the series in (2.4) is convergent if
either $0 < \alpha < 1$ or p = q = 1/2. For more details, see LePage, Woodroofe, and Zinn (1981).
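
The series representation of the stable process can be approximated by truncating the sum after finitely many Poisson arrivals. The following Python sketch does this under stated assumptions: the series is taken to be $\sum_k \delta_k \Gamma_k^{-1/\alpha} I(U_k \le t)$, the truncation level `n_terms` is an arbitrary illustrative choice, and the function name is hypothetical. For $\alpha$ close to 2 the truncated series converges slowly, so this is a rough illustration rather than an exact sampler.

```python
import random


def lepage_stable_path(alpha, grid, n_terms=2000, p=0.5, seed=0):
    """Approximate S(t) on a grid in [0, 1] by a truncated LePage-type series."""
    rng = random.Random(seed)
    gamma = 0.0
    jumps = []
    for _ in range(n_terms):
        gamma += rng.expovariate(1.0)               # Gamma_k: Poisson arrival times
        delta = 1.0 if rng.random() < p else -1.0   # delta_k: random sign
        u = rng.random()                            # U_k: jump location in [0, 1)
        jumps.append((u, delta * gamma ** (-1.0 / alpha)))
    # S(t) adds up the jumps whose location U_k is at most t.
    return [sum(size for u, size in jumps if u <= t) for t in grid]


grid = [i / 10 for i in range(11)]
path = lepage_stable_path(alpha=1.3, grid=grid)
for t, s in zip(grid, path):
    print(f"S({t:.1f}) ~ {s:.4f}")
```

The largest jumps come from the first few arrival times, which is why a heavy-tailed path is typically dominated by a handful of terms of the series.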

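To derive the main result of this paper, we assume that conditions A1-A4 hold and we define
the following processes on the Skorohod space D[0,1]:

$$
\begin{aligned}
S_n^{(1)}(t) &= a_n^{-1} \sum_{k=1}^{[nt]} (-1)^k \epsilon_k, \\
T_n(t) &= \begin{pmatrix} T_{1,n}(t) \\ T_{2,n}(t) \end{pmatrix}
        = a_n^{-1} \sum_{k=1}^{[nt]} \begin{pmatrix} \cos(k\theta) \\ \sin(k\theta) \end{pmatrix} \epsilon_k, \\
R_n(t) &= \begin{pmatrix} R_{1,n}(t) \\ R_{2,n}(t) \end{pmatrix}
        = n^{-1/2} \sum_{k=1}^{[nt]} \begin{pmatrix} \sin((k-1)\theta) \\ \cos((k-1)\theta) \end{pmatrix} \psi(\epsilon_k), \\
W_n(t) &= n^{-1/2} \sum_{k=1}^{[nt]} \psi(\epsilon_k), \qquad
V_n(t) = n^{-1/2} \sum_{k=1}^{[nt]} \left[\psi'(\epsilon_k) - E(\psi'(\epsilon_k))\right].
\end{aligned}
\tag{2.5}
$$

Similar to Theorem 4 of Resnick and Greenwood (1979), we can show that

$$\left(S_n(\cdot), S_n^{(1)}(\cdot), T_n(\cdot), W_n(\cdot), R_n(\cdot), V_n(\cdot)\right) \xrightarrow{d} \left(S(\cdot), S^{(1)}(\cdot), T(\cdot), W(\cdot), R(\cdot), V(\cdot)\right), \tag{2.6}$$

where $S(\cdot)$ and $S^{(1)}(\cdot)$ are stable processes and $T(\cdot)$ is a bivariate stable process which is
defined in Lemma 3 (see (7.1)). Also, $W(\cdot)$ and $V(\cdot)$ are standard Brownian motion processes with
$E(W^2(t)) = tE(\psi^2(\epsilon_1))$ and $E(V^2(t)) = t\,\mathrm{Var}(\psi'(\epsilon_1))$. Finally, since both $\sum_{k=1}^{n} \sin^2((k-1)\theta)$
and $\sum_{k=1}^{n} \cos^2((k-1)\theta)$ are O(n) for large values of n, the Lindeberg-Feller central limit
theorem and the tightness of the partial sum process $R_n(\cdot)$ give

$$R_n(t) \xrightarrow{d} R(t) = \frac{E^{1/2}(\psi^2(\epsilon_1))}{\sqrt{2}} \begin{pmatrix} W_1(t) \\ W_2(t) \end{pmatrix}. \tag{2.7}$$

Here, $W_1(\cdot)$ and $W_2(\cdot)$ are independent standard Brownian motion processes. For $0 < \alpha < 2$,
$(S(\cdot), S^{(1)}(\cdot), T(\cdot))^T$ is independent of $(W(\cdot), R(\cdot), V(\cdot))^T$. The dependence structure can be
constructed by applying the continuous mapping theorem on (5.1) in Appendix A. For more details
see Resnick and Greenwood (1979).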


3 The limiting distribution for AR(p) 


The limiting distribution for the M-estimates of the parameters of an infinite-variance random-walk
process is obtained in Knight (1989). We extend the results of Knight (1989) to the
AR(p) process when characteristic roots may have different multiplicities and lie on the unit circle.
To derive the asymptotic behavior of the M-estimates, consider the AR(p) model in (1.1) when the
errors satisfy Assumption A4. Define the process


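$$\Lambda_n(q_1, \ldots, q_p) = \sum_{t=p+1}^{n} \left[ \rho\left(\epsilon_t - q_1 b_{n1}^{-1} X_{t-1} - \cdots - q_p b_{np}^{-1} X_{t-p}\right) - \rho(\epsilon_t) \right], \tag{3.1}$$

where $B_n = \operatorname{diag}(b_{n1}, \ldots, b_{np})$ is the matrix of appropriate normalizing
constants (see Davis et al. (1992)). Note that the diagonal entries of $B_n$ vary according to different
roots with different multiplicities and they will be specified in Theorem 1. Thus, it is reasonable to
expect that the minimizer of the process $\Lambda_n$ can be written as

$$(\hat\phi_1 - \phi_1, \hat\phi_2 - \phi_2, \ldots, \hat\phi_p - \phi_p)^T = \mathcal{D}_n^{-1} \mathcal{T}_n. \tag{3.2}$$

Here

$$\mathcal{T}_n = \left( \sum_{t=p+1}^{n} X_{t-1}\psi(\epsilon_t),\; \sum_{t=p+1}^{n} X_{t-2}\psi(\epsilon_t),\; \ldots,\; \sum_{t=p+1}^{n} X_{t-p}\psi(\epsilon_t) \right)^T$$

and $\mathcal{D}_n = (d_{ij})$ is a p × p matrix such that

$$d_{ij} = \sum_{t=p+1}^{n} X_{t-i} X_{t-j}\, \psi'(c_t^{ij}), \qquad i, j = 1, \ldots, p, \tag{3.3}$$

where $|c_t^{ij} - \epsilon_t| < |q_1 b_{n1}^{-1} X_{t-1} - \cdots - q_p b_{np}^{-1} X_{t-p}|$. Asymptotically, $\psi'(c_t^{ij})$ can be replaced
by $\psi'(\epsilon_t)$ in (3.3). To see this, assume that $b_{ni} = n^{1/2} a_n$ for all i = 1, ..., p. Since
$|\psi'(\epsilon_t) - \psi'(c_t^{ij})| \le k\,|q_1 n^{-1/2} a_n^{-1} X_{t-1} - \cdots - q_p n^{-1/2} a_n^{-1} X_{t-p}|$, we have

$$n^{-1} a_n^{-2} \left| \sum_{t=p+1}^{n} X_{t-i} X_{t-j} \left[\psi'(\epsilon_t) - \psi'(c_t^{ij})\right] \right| \le k \sum_{l=1}^{p} |q_l|\, n^{-3/2} a_n^{-3} \sum_{t=p+1}^{n} |X_{t-i} X_{t-j} X_{t-l}| \xrightarrow{p} 0.$$

In Theorem 1 we show that $n^{1/2} a_n$ is the minimum value for $b_{ni}$, i = 1, ..., p. Thus, this result
holds for all other normalizing constants. For the other entries of $\mathcal{D}_n$, results follow similarly.
Furthermore, in (7.6) (Appendix A) we prove that asymptotically each $\psi'(\epsilon_t)$ can be replaced by
$E(\psi'(\epsilon_1))$. Therefore, as $n \to \infty$, the matrix $\mathcal{D}_n$ will be nonsingular with probability 1.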

The limiting distribution of the parameters in (3.2) is obtained through several steps. First, since
time series with different characteristic roots are expected to behave differently, we can decompose
the characteristic polynomial defined in (1.2) into the following product:

$$\phi(z) = (1 - z)^r (1 + z)^s \prod_{k=1}^{l} \left(1 - 2\cos(\theta_k) z + z^2\right)^{d_k} \psi(z), \tag{3.4}$$

where the polynomial $\psi(\cdot)$ corresponds to q roots which are outside the unit circle and
$q + r + s + 2\sum_{k=1}^{l} d_k = p$. Similar to Chan and Wei (1988), we transform $\{X_t\}$ into various components based
on the location of their roots. Then we find the limiting behavior of each component individually.
Davis et al. (1992) consider the asymptotic behavior of M-estimates for the parameters in a stationary
AR(p) process. In this paper we take $\psi(z) = 1$. Define $u_t = \phi(B)(1 - B)^{-r} X_t$, $v_t = \phi(B)(1 + B)^{-s} X_t$,
and $w_t(k) = \phi(B)(1 - 2\cos(\theta_k)B + B^2)^{-d_k} X_t$ for k = 1, 2, ..., l. Equivalently,

$$\epsilon_t = (1 - B)^r u_t = (1 + B)^s v_t = \left(1 - 2\cos(\theta_k)B + B^2\right)^{d_k} w_t(k).$$

From Chan and Wei (1988), there exists a nonsingular p × p matrix Q such that

$$Q\mathbf{X}_t = \left(\mathbf{u}_t^T, \mathbf{v}_t^T, \mathbf{w}_t^T(1), \ldots, \mathbf{w}_t^T(l)\right)^T,$$

where $\mathbf{X}_t = (X_t, \ldots, X_{t-p+1})^T$, $\mathbf{u}_t = (u_t, \ldots, u_{t-r+1})^T$, $\mathbf{v}_t = (v_t, \ldots, v_{t-s+1})^T$, and
$\mathbf{w}_t(k) = (w_t(k), \ldots, w_{t-2d_k+1}(k))^T$ for k = 1, 2, ..., l. Moreover, let $G_n = \operatorname{diag}(J_n, K_n, L_n(1), \ldots, L_n(l))$
be a normalization matrix, where $J_n$ and $K_n$ are as specified in (7.4) and (7.11) of Appendix A.
Also, $L_n(i)$ for $1 \le i \le l \le p$ are defined similar to $L_n$ in (7.18) in Appendix A. Then we have

$$G_n Q \mathbf{X}_t = \left((J_n\mathbf{u}_t)^T, (K_n\mathbf{v}_t)^T, (L_n(1)\mathbf{w}_t(1))^T, \ldots, (L_n(l)\mathbf{w}_t(l))^T\right)^T$$

and

$$
(Q^T G_n^T)^{-1}(\hat{\Phi} - \Phi) \sim
\begin{pmatrix}
(J_n^T)^{-1}\left(\sum_{t=r+1}^{n} \mathbf{u}_{t-1}\mathbf{u}_{t-1}^T \psi'(\epsilon_t)\right)^{-1} \sum_{t=r+1}^{n} \mathbf{u}_{t-1}\psi(\epsilon_t) \\
(K_n^T)^{-1}\left(\sum_{t=s+1}^{n} \mathbf{v}_{t-1}\mathbf{v}_{t-1}^T \psi'(\epsilon_t)\right)^{-1} \sum_{t=s+1}^{n} \mathbf{v}_{t-1}\psi(\epsilon_t) \\
(L_n^T(1))^{-1}\left(\sum_{t=2d_1+1}^{n} \mathbf{w}_{t-1}(1)\mathbf{w}_{t-1}^T(1) \psi'(\epsilon_t)\right)^{-1} \sum_{t=2d_1+1}^{n} \mathbf{w}_{t-1}(1)\psi(\epsilon_t) \\
\vdots \\
(L_n^T(l))^{-1}\left(\sum_{t=2d_l+1}^{n} \mathbf{w}_{t-1}(l)\mathbf{w}_{t-1}^T(l) \psi'(\epsilon_t)\right)^{-1} \sum_{t=2d_l+1}^{n} \mathbf{w}_{t-1}(l)\psi(\epsilon_t)
\end{pmatrix},
\tag{3.5}
$$

where $a_n \sim b_n$ means $a_n = b_n + o_p(1)$. From (3.5), we can find the weak limit behavior of
M-estimates for the AR(p) processes defined in (1.1). The result is presented in Theorem 1.

Theorem 1. Suppose $\{X_t\}$ satisfies (1.1) and conditions A1-A4 hold. Then

$$(Q^T G_n^T)^{-1}(\hat{\Phi} - \Phi) \xrightarrow{d} \left((\Gamma^{-1}\mathcal{J})^T, (T^{-1}\mathcal{H})^T, (\Lambda_1^{-1}\mathcal{S}_1)^T, \ldots, (\Lambda_l^{-1}\mathcal{S}_l)^T\right)^T, \tag{3.6}$$

where $(\Gamma^{-1}\mathcal{J})$ and $(T^{-1}\mathcal{H})$, respectively, are defined in (7.9) and (7.14) in Appendix A. Also,
$(\Lambda_i^{-1}\mathcal{S}_i)$ for i = 1, ..., l are given similar to $\Lambda^{-1}\mathcal{S}$ in (7.21). Note that these limiting distributions
are functionals of multiple stochastic integrals of stable processes.

Proof. See Appendix A.

An illustration of the results in Theorem 1 is given by the following example for
an AR(2) process.


Example 1. Suppose $\{X_t\}$ is an AR(2) process. Consider the following cases.
(i) When $\phi(z) = 1 - 2z\cos\theta + z^2$.


„i/2„ ( (t>i-2cos9 \ d I E(v-'(£i))(/o AAt)<i6)+/o T’lWAt)) 
^ ^2 + 1 ] ^ ^ 1 1 2singEV^(v,^(s,))r. 


2sineEi/^(V-^(ei))ri 
i))(/o AAt)<i6)+/o T’K 
2sin6»Ei/^(V>^(ei))r2 
E(V>'(£i))(/o AAt)<i6)+/o Ti(t)d(t)) 


where 


rt=cose(^J^ Ti{t)dRi{t) - T2{t)dR2{t)^ +sm0(^J^ Ti{t)dR2{t) + T2{t)dRi{t) 


and 


T2= f Tt{t)dRi{t)- [ T2{t)dR2{t), ri = 
Jo Jo 

(ii) When $\phi(z) = 1 - 2z + z^2$.


1 cosO 
cos 9 1 


_ 2) 

- 2 ) + + 1 ) 


4 r-^ 


E'/"(V>Aei))/o S(t) dW(t) 
E(V>'(ei)) 

E(V-'Ci)) 


r2 = 


Y (t) dt Y A (f) fo S (s) ds dt \ 

lo ^ (^) fo ^ fo (fo ^ J 


where 
(iii) When $\phi(z) = 1 + 2z + z^2$.


+ 2) - nz'/'^an{^2 + 1) 
n^/^a„((^2 + 1) 


4 -rji 


f* S(i)(s) d.dW{t) 

EU'iei)) 

UU(ei)) 


where 


r3 = 


fo (fo (®) Jo (^) fo (■s) ds dt 

fo W fo ('S) fo {f)f dt 


(iv) When $\phi(z) = 1 - z^2$.


( n^/2a„^i/2 + n^/^a„((^2 - l)/2 \ d 
■rTl'^anf)il2-'rTl'^an{(i)2-T)l2 ) 


E^/-^{i,\eP)j^S{t)dW{t) \ 

E(V-'( 6 i))/o^S=(t)dt \ 

E'''"(V>"(£i))/o S<i)(t)dW(t) I > 

E(^'(ei))/o' (S(E(t))"dt / 


where $S(\cdot)$, $T(\cdot)$, and $R(\cdot)$ are defined in (2.6).


Remark 1. Example 1 shows a higher convergence rate of the M-estimates compared to the classical
LS estimates in Chan and Zhang (2012). For instance, the rate of convergence of the M-estimates
in case (i) is $n^{1/2} a_n$, which is considerably higher than that of the LS estimate at the rate of n, especially
for small values of $\alpha$. Moreover, the results in Example 1, cases (ii) and (iii), show different
convergence rates for the estimators. Broadly speaking, the linear combination of $\hat\phi_1$ and $\hat\phi_2$ converges
faster than each of them individually; see Chan and Zhang (2012).
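
To get a feel for the size of this gap, the short sketch below compares the two rates numerically. It assumes pure power-law tails, so that the normalizing constants are $a_n = n^{1/\alpha}$ (the slowly varying function is taken to be constant); this simplification and the function names are assumptions made for the illustration only.

```python
def m_rate(n, alpha):
    """Rate n^{1/2} a_n of the M-estimate, assuming a_n = n^{1/alpha}."""
    return n ** (0.5 + 1.0 / alpha)


def ls_rate(n):
    """Rate n of the LS estimate in case (i) of Example 1."""
    return float(n)


n = 500
for alpha in (0.5, 1.0, 1.3, 1.7, 2.0):
    ratio = m_rate(n, alpha) / ls_rate(n)
    print(f"alpha = {alpha}: M rate / LS rate = {ratio:.1f}")
```

Under this assumption the ratio equals $n^{1/\alpha - 1/2}$, which is 1 at $\alpha = 2$ and grows quickly as $\alpha$ decreases.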

4 Simulation 

To investigate the results given in Theorem 1, we carry out a simulation for model (1.1) when p = 2
and $\phi(B) = 1 - \phi_1 B - \phi_2 B^2$ has complex conjugate unit roots. In other words, we illustrate the
asymptotic result of case (i) in Example 1 by simulating the following AR(2) process

$$X_t = 2\cos\theta\, X_{t-1} - X_{t-2} + \epsilon_t, \tag{4.1}$$

where the innovations are i.i.d. symmetric $\alpha$-stable random variables. Figure 1 shows different
sample paths of model (4.1) when $\alpha = 1.3$ and n = 500. For the sake of brevity, we only present
the M-estimates for $\phi_1$ when $\theta = \pi/4$. The simulations for the other unstable cases are similar.

In our simulation study, we consider the time series $\{X_t\}_{t=0}^{n}$ in model (4.1), for n = 10, 20,
30, 40, and 50 with $\alpha$ = 0.5, 1, 1.3, 1.7 and 2. The time series are generated 10,000 times for
each choice of n and $\alpha$. The M-estimates of $\phi_1$ in AR(2) are calculated for each replicate. Here,
we use the Huber loss function, $\rho_h(x)$ in (2.1), with c = 5 in all cases. The sample median and
the sample 90% inter-percentile range (IPR) (95th percentile - 5th percentile) for $|\hat\phi_1 - 2\cos\theta|$ are






Sohrabi and Zarepour: Asymptotic Theory for M-Estimates in Unstable AR(p) Processes 


Figure 1: Different sample paths for model (4.1) when $\alpha = 1.3$ and n = 500 (four panels, each plotted against the time index).



tabulated in Table 1. Furthermore, the simulated results of the sample median and 90% IPR for
$|\hat\phi_1 - 2\cos\theta|$ with the LS method are shown in Table 2. As seen in Tables 1 and 2, the M-estimates
are considerably closer to the actual values than the LS estimates as n gets larger, especially
for small values of $\alpha$. Note that, when $\alpha = 2$, the rates of convergence of the M-estimate and the LS
estimate are the same. Moreover, the 90% IPR values for the M-estimates are considerably smaller
than those for the LS estimates. The tables confirm that M-estimates are more precise than LS
estimates.
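
A single replicate of this experiment can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: innovations are drawn with the Chambers-Mallows-Stuck formula for symmetric stable laws, the series starts from zero initial values, and the Huber M-estimate is computed by iteratively reweighted least squares (one common way to minimize a Huber objective). All function names are hypothetical.

```python
import math
import random


def sym_stable(alpha, rng):
    """Symmetric alpha-stable draw (Chambers-Mallows-Stuck formula)."""
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(v)
    return (math.sin(alpha * v) / math.cos(v) ** (1 / alpha)
            * (math.cos((1 - alpha) * v) / w) ** ((1 - alpha) / alpha))


def simulate_ar2(n, theta, alpha, rng):
    """X_t = 2cos(theta) X_{t-1} - X_{t-2} + e_t, started from zeros (assumption)."""
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(2 * math.cos(theta) * x[-1] - x[-2] + sym_stable(alpha, rng))
    return x[2:]


def solve_wls(x, weights):
    """Weighted LS of x[t] on (x[t-1], x[t-2]) via the 2x2 normal equations."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        w = weights[t]
        s11 += w * x[t - 1] * x[t - 1]
        s12 += w * x[t - 1] * x[t - 2]
        s22 += w * x[t - 2] * x[t - 2]
        b1 += w * x[t - 1] * x[t]
        b2 += w * x[t - 2] * x[t]
    det = s11 * s22 - s12 * s12
    return ((s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)


def huber_ar2(x, c=5.0, n_iter=50):
    """Huber M-estimate of (phi_1, phi_2) by iteratively reweighted least squares."""
    weights = [1.0] * len(x)
    phi = solve_wls(x, weights)  # start from the LS estimate
    for _ in range(n_iter):
        for t in range(2, len(x)):
            r = abs(x[t] - phi[0] * x[t - 1] - phi[1] * x[t - 2])
            weights[t] = 1.0 if r <= c else c / r  # Huber weight psi(r) / r
        phi = solve_wls(x, weights)
    return phi


rng = random.Random(2024)
theta = math.pi / 4
x = simulate_ar2(50, theta, 1.3, rng)
ls = solve_wls(x, [1.0] * len(x))
m = huber_ar2(x)
print("LS error:", abs(ls[0] - 2 * math.cos(theta)))
print("M  error:", abs(m[0] - 2 * math.cos(theta)))
```

Repeating the last block over many seeds and recording the median and the 5th and 95th percentiles of the errors would give a rough analogue of the summaries reported in the tables, although the numbers will depend on the generator and the optimizer used.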

Table 1: Median and 90% IPR (in parentheses) for $|\hat\phi_1 - 2\cos\theta|$ in model (4.1) by the M-estimate
method using the Huber loss function

                            Index of stability α
 n     0.5               1                 1.3               1.7               2
 10    0.0258 (0.4141)   0.1170 (0.4097)   0.1501 (0.4048)   0.1839 (0.4003)   0.1947 (0.3961)
 20    0.0030 (0.0815)   0.0342 (0.2494)   0.0552 (0.2981)   0.0734 (0.3456)   0.0852 (0.3673)
 30    0.0009 (0.0245)   0.0180 (0.1367)   0.0328 (0.1797)   0.0444 (0.2133)   0.0538 (0.2323)
 40    0.0004 (0.0107)   0.0115 (0.0841)   0.0216 (0.1234)   0.0329 (0.1628)   0.0391 (0.1788)
 50    0.0002 (0.0063)   0.0083 (0.0584)   0.0159 (0.0916)   0.0263 (0.1278)   0.0318 (0.1396)


Table 2: Median and 90% IPR (in parentheses) for $|\hat\phi_1 - 2\cos\theta|$ in model (4.1) by the LS estimate
method

                            Index of stability α
 n     0.5               1                 1.3               1.7               2
 10    0.0824 (2.0654)   0.1439 (0.9341)   0.1633 (0.7900)   0.1867 (0.7794)   0.194 (0.7595)
 20    0.0297 (0.3620)   0.0543 (0.3295)   0.0660 (0.3382)   0.0765 (0.3541)   0.0852 (0.3674)
 30    0.0180 (0.2038)   0.0351 (0.2224)   0.0429 (0.2200)   0.0477 (0.2223)   0.0538 (0.2323)
 40    0.0133 (0.1323)   0.0257 (0.1567)   0.0311 (0.1622)   0.0362 (0.1719)   0.0391 (0.1788)
 50    0.0105 (0.1070)   0.0202 (0.1247)   0.0243 (0.1259)   0.0286 (0.1351)   0.0318 (0.1397)


5 Bootstrapping

The asymptotic distributions for the proposed estimators of $\Phi = (\phi_1, \ldots, \phi_p)^T$ by Chan and Zhang
(2012) and this article are generally intractable. Due to the complexity of the limiting distributions,

to make inferences based on $\hat\phi_1, \ldots, \hat\phi_p$, one may consider a resampling scheme. In this section,
we briefly suggest using the result of Theorem 1 of Moreno and Romo (2012) and Theorem 3.1 of
Zarepour and Knight (1999a) in model (1.1), when $\{\epsilon_t\}$ is a sequence of i.i.d. random variables
in the domain of attraction of a stable law with index $\alpha$. Define the point process

$$\sum_{i=1}^{n} \varepsilon_{(i/n,\; a_n^{-1}\epsilon_i)},$$

where $\varepsilon_x$ is a measure defined by $\varepsilon_x(A) = I(x \in A)$ for any Borel set A
such that $A \subset [0,1] \times \mathbb{R}$. From Resnick (1987), we have

$$\sum_{i=1}^{n} \varepsilon_{(i/n,\; a_n^{-1}\epsilon_i)} \xrightarrow{d} \sum_{i=1}^{\infty} \varepsilon_{(U_i,\; \delta_i\Gamma_i^{-1/\alpha})}, \tag{5.1}$$

where $U_i \sim U[0,1]$ is independent from $\{\delta_i, \Gamma_i\}$, which are defined in (2.4). Here $\xrightarrow{d}$ indicates
weak convergence with respect to the vague topology; see Resnick (1987). Now, assume that
$\{\epsilon_1^*, \epsilon_2^*, \ldots, \epsilon_n^*\}$ is an i.i.d. sample from $F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I(\epsilon_i \le x)$. Then, the bootstrapped point
process

$$\sum_{i=1}^{n} \varepsilon_{(i/n,\; a_n^{-1}\epsilon_i^*)}([0,t] \times \cdot) \to \sum_{i=1}^{\infty} \Gamma_i^*(t)\, \varepsilon_{\delta_i\Gamma_i^{-1/\alpha}}(\cdot) \tag{5.2}$$

in distribution. Here, $\{\Gamma_i^*(t)\}$ is a sequence of independent Poisson processes with Lebesgue
mean measure and $\{\delta_i, \Gamma_i\}$ are the same as before. Note that in this case, the regular bootstrap
asymptotically fails for the bootstrapped point process since the limit in (5.2) contains the extra
Poisson point process $\Gamma_i^*(t)$. Subsampling of size m with $m/n \to 0$ as $n \to \infty$ resolves the
asymptotic failure of the regular bootstrap; see Zarepour and Knight (1999b). By the same
argument used for the stationary AR(p) processes with infinite variance in Davis and Wu (1997),
we can prove that the results of the limiting distributions are asymptotically valid when we use
the bootstrap scheme with m = o(n) resampling sample size. Given $(X_1, \ldots, X_n)$, we find the M-estimate
$\hat\Phi$ of $\Phi$ in model (1.1) using the objective function in (2.1). The residuals are calculated
from $\hat\epsilon_t = X_t - \hat\phi_1 X_{t-1} - \cdots - \hat\phi_p X_{t-p}$, t = p + 1, ..., n. Then, the bootstrap replicates
$(X_1^*, \ldots, X_m^*)$ are generated from $X_t^* = \hat\phi_1 X_{t-1}^* + \cdots + \hat\phi_p X_{t-p}^* + \epsilon_t^*$, where $\epsilon_1^*, \ldots, \epsilon_m^*$ is a sample
of size m from

$$\hat F_n(\cdot) = \frac{1}{n-p} \sum_{t=p+1}^{n} I(\hat\epsilon_t - \bar\epsilon \le \cdot), \qquad \text{where } \bar\epsilon = \frac{1}{n-p} \sum_{t=p+1}^{n} \hat\epsilon_t.$$

The bootstrap replicate of $\hat\Phi$ is found by minimizing

$$\Lambda_m^*(q_1, \ldots, q_p) = \sum_{t=p+1}^{m} \left[ \rho\left(\epsilon_t^* - q_1 b_{m1}^{-1} X_{t-1}^* - \cdots - q_p b_{mp}^{-1} X_{t-p}^*\right) - \rho(\epsilon_t^*) \right],$$

with the minimizer given, as in (3.2), by $\mathcal{D}_m^{*-1}\mathcal{T}_m^*$, where $B_m = \operatorname{diag}(b_{m1}, \ldots, b_{mp})$ is the matrix
of appropriate normalizing constants. The following lemma helps to derive the limiting distribution
of bootstrap estimates.

Lemma 1. Let $\{\epsilon_1^*, \ldots, \epsilon_m^*\}$ be an i.i.d. sample from $F_n$ and let $E^*$ denote the expectation under $F_n$.
Under conditions A1-A4 and with subsampling of size m, where $m \to \infty$ and $m/n \to 0$,
we have

(i) $a_m^{-1}\sum_{i=1}^{[mt]} \epsilon_i^* \xrightarrow{d} S(t)$ in probability, where $S(\cdot)$ is the stable process defined in (2.4),

(ii) $E^*(\psi(\epsilon_1^*)) = 0$,

(iii) $m^{-1/2}\sum_{i=1}^{[mt]} \psi(\epsilon_i^*) \xrightarrow{d} W(t)$ in probability, where $W(\cdot)$ is a standard Brownian motion
independent of $S(\cdot)$ with $E(W^2(t)) = tE(\psi^2(\epsilon_1))$,

(iv) $E^*(\psi'(\epsilon_1^*)) \xrightarrow{p} E(\psi'(\epsilon_1))$.

We omit the proof of Lemma 1 since it follows from a similar argument used in Proposition 4
of Moreno and Romo (2012) and Arcones and Gine (1989). See also Sohrabi and Zarepour (2016).
From Lemma 1 we can conclude that

$$\left(S_m^*(\cdot), S_m^{*(1)}(\cdot), T_m^*(\cdot), W_m^*(\cdot), R_m^*(\cdot), V_m^*(\cdot)\right) \xrightarrow{d} \left(S(\cdot), S^{(1)}(\cdot), T(\cdot), W(\cdot), R(\cdot), V(\cdot)\right)$$

in probability. Here $S_m^*(\cdot)$ is defined as in Lemma 1 and $(S_m^{*(1)}(\cdot), T_m^*(\cdot), W_m^*(\cdot), R_m^*(\cdot), V_m^*(\cdot))$
is defined in (2.5) by replacing $\epsilon_t$ by $\epsilon_t^*$ and n by m. Also, $\{S(\cdot), S^{(1)}(\cdot), T(\cdot), W(\cdot), R(\cdot), V(\cdot)\}$
is as specified in (2.6). Similar to Theorem 3.1 of Davis and Wu (1997), along with applying the
continuous mapping theorem and Lemma 1, we have

$$(Q^T G_m^T)^{-1}(\hat\Phi^* - \hat\Phi) \xrightarrow{d} \left((\Gamma^{-1}\mathcal{J})^T, (T^{-1}\mathcal{H})^T, (\Lambda_1^{-1}\mathcal{S}_1)^T, \ldots, (\Lambda_l^{-1}\mathcal{S}_l)^T\right)^T$$

in probability, where $(\Gamma^{-1}\mathcal{J})$, $(T^{-1}\mathcal{H})$, and $(\Lambda_i^{-1}\mathcal{S}_i)$ for i = 1, ..., l are specified as in Theorem
1. For a detailed proof of the asymptotic validity of the m out of n bootstrap in a special case see
Sohrabi and Zarepour (2016). Also see Zarepour and Knight (1999a, b), Moreno and Romo (2012),
Davis and Wu (1997), and Sohrabi (2016) for some similar techniques. An illustration of the validity
of the bootstrap scheme is given by a simulation study in the following example.

Example 2. We consider bootstrapping for the parameter $\phi_1$ in model (4.1) when $\theta = \pi/4$. We
generate the time series $\{X_t\}_{t=0}^{n}$ in model (4.1), for the actual sample sizes n = 50, 100, and 200
with $\alpha$ = 1.3 and 1.7 (the cases with $\alpha > 1$ are of practical interest). The behavior of the other
unstable cases is more or less similar. Then we implement the following algorithm for each choice
of n and $\alpha$.

(i) We find M-estimates of the parameters $\phi_1$ and $\phi_2$ in model (4.1) using the Huber loss function
with c = 5. Then take

$$\hat\epsilon_t = X_t - \hat\phi_1 X_{t-1} - \hat\phi_2 X_{t-2}.$$

(ii) We draw a sample of size m from the centered residuals, denoted by $\epsilon_1^*, \ldots, \epsilon_m^*$, and find $\{X_t^*\}$
from (4.1). Then we estimate the parameter $\phi_1$ using the bootstrap observations by the same
minimization of the objective function used in step (i).

(iii) We repeat step (ii) B = 3,000 times to get $\hat\phi_{1,1}^*, \ldots, \hat\phi_{1,B}^*$. To find a naive 95% confidence
interval for $\phi_1$, we obtain the 2.5th and 97.5th percentiles of the 3,000 bootstrap estimates as the
lower and upper bound of our confidence interval.
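
Steps (i)-(iii) can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the Huber M-estimate is computed by iteratively reweighted least squares, the bootstrap series start from zero initial values, the percentile rule uses a simple index into the sorted estimates, and the number of bootstrap replicates is kept small so the example runs quickly. All function names are hypothetical.

```python
import math
import random


def sym_stable(alpha, rng):
    """Symmetric alpha-stable draw (Chambers-Mallows-Stuck formula), alpha != 1."""
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    return (math.sin(alpha * v) / math.cos(v) ** (1 / alpha)
            * (math.cos((1 - alpha) * v) / w) ** ((1 - alpha) / alpha))


def fit_huber_ar2(x, c=5.0, n_iter=30):
    """Huber M-estimate of (phi_1, phi_2) by iteratively reweighted least squares."""
    w = [1.0] * len(x)
    phi = (0.0, 0.0)
    for _ in range(n_iter + 1):
        s11 = s12 = s22 = b1 = b2 = 0.0
        for t in range(2, len(x)):
            s11 += w[t] * x[t - 1] ** 2
            s12 += w[t] * x[t - 1] * x[t - 2]
            s22 += w[t] * x[t - 2] ** 2
            b1 += w[t] * x[t - 1] * x[t]
            b2 += w[t] * x[t - 2] * x[t]
        det = s11 * s22 - s12 * s12
        phi = ((s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)
        for t in range(2, len(x)):
            r = abs(x[t] - phi[0] * x[t - 1] - phi[1] * x[t - 2])
            w[t] = 1.0 if r <= c else c / r  # Huber weight psi(r) / r
    return phi


def ar2_path(phi, innovations):
    """Generate X_t = phi_1 X_{t-1} + phi_2 X_{t-2} + e_t from zero initial values."""
    x = [0.0, 0.0]
    for e in innovations:
        x.append(phi[0] * x[-1] + phi[1] * x[-2] + e)
    return x[2:]


def bootstrap_ci(x, m, n_boot=200, level=0.95, seed=0):
    """Naive percentile interval for phi_1 from an m out of n residual bootstrap."""
    rng = random.Random(seed)
    phi_hat = fit_huber_ar2(x)                                   # step (i)
    resid = [x[t] - phi_hat[0] * x[t - 1] - phi_hat[1] * x[t - 2]
             for t in range(2, len(x))]
    mean = sum(resid) / len(resid)
    centered = [r - mean for r in resid]
    stars = []
    for _ in range(n_boot):                                      # steps (ii)-(iii)
        e_star = [rng.choice(centered) for _ in range(m)]
        stars.append(fit_huber_ar2(ar2_path(phi_hat, e_star))[0])
    stars.sort()
    lo = stars[int((1 - level) / 2 * (n_boot - 1))]
    hi = stars[int((1 + level) / 2 * (n_boot - 1))]
    return phi_hat[0], (lo, hi)


rng = random.Random(7)
theta = math.pi / 4
x = ar2_path((2 * math.cos(theta), -1.0), [sym_stable(1.3, rng) for _ in range(100)])
estimate, interval = bootstrap_ci(x, m=int(100 ** 0.95))
print("phi_1 estimate:", estimate)
print("naive 95% bootstrap interval:", interval)
```

Estimating the coverage rate would require wrapping the last block in an outer loop over many simulated series and counting how often the interval contains $2\cos\theta$.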

In order to compute the coverage rate of the bootstrap confidence intervals, the original time series
are generated 10,000 times for each choice of n and $\alpha$. Then by applying (i)-(iii), the naive
95% bootstrap confidence interval for $\phi_1$ is calculated for each replicate. Moreover, to study how
the selection of the resampling size would affect our estimation, we perform the second step with
three different resampling sizes m = n/ln(ln(n)), $n^{0.9}$, and $n^{0.95}$. Finally, the resulting coverage
percentages of the naive 95% bootstrap confidence intervals for the parameter $\phi_1$ for different
values of m, n, and $\alpha$ are presented in Table 3. This table shows that the coverage percentages of
the naive 95% bootstrap confidence intervals for $\phi_1$ are close to 95% in most cases. This illustrates that the
bootstrap scheme with m = o(n) resampling sample size is approximately valid when we have a
non-stationary time series with innovations in the domain of attraction of a stable law. The simulation
procedure also indicates which of these resampling sizes performs consistently well in all our cases; see Table 3.
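
For reference, the three resampling sizes listed in Table 3 can be computed as in the sketch below. Rounding down to an integer is an assumption of this example; the paper does not state how non-integer sizes are handled.

```python
import math


def resample_sizes(n):
    """Candidate m out of n bootstrap sizes, rounded down (rounding is an assumption)."""
    return {
        "n/ln(ln(n))": int(n / math.log(math.log(n))),
        "n^0.9": int(n ** 0.9),
        "n^0.95": int(n ** 0.95),
    }


for n in (50, 100, 200):
    print(n, resample_sizes(n))
```

At these sample sizes all three choices are a large fraction of n, so the condition m/n → 0 is only met in the limit.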


Table 3: Coverage for the naive 95% bootstrap confidence interval for $\phi_1$ in model (4.1)

                        α = 1.3                      α = 1.7
 m                  n = 50   n = 100  n = 200    n = 50   n = 100  n = 200
 n/ln(ln(n))        96.1%    96.5%    97.4%      95.1%    96.5%    96.8%
 n^0.9              91.2%    97.4%    97.7%      96.6%    96.9%    97.4%
 n^0.95             95.7%    96.6%    96.3%      94.0%    94.8%    94.8%


6 Conclusion 

In this paper, the asymptotic properties of the M-estimate have been studied for
unstable AR(p) processes when $\{\epsilon_t\}$ is a sequence of random variables in the domain of attraction
of a stable law with index $0 < \alpha < 2$. The behavior of these time series with several real and
complex conjugate roots on the unit circle is completely different from that of processes with a single unit
root. The robust M-estimate method has been used to derive the limiting distribution of estimates of
the parameters in the AR(p) model defined in (1.1). Although the M-estimates have a better rate of
convergence compared to LS estimates, the limiting distributions are not computationally tractable.
To remedy this difficulty, we suggest an m out of n bootstrap where $m/n \to 0$ as $n \to \infty$. Our
simulation study supports the validity of this resampling scheme for our non-stationary time
series when errors are in the domain of attraction of a stable law.

Based on our analysis, when errors are heavy tailed, M-estimators are always superior to LS
estimators. This superiority also holds for LAD, but some extra conditions should be imposed on the
innovations. In this case, Knight (1989) and Davis et al. (1992) assume that the innovations have
median 0 and a density with respect to Lebesgue measure. Since the computational complexity of
M-estimates and LAD estimates is similar, we do not consider the LAD estimate, to avoid
imposing extra conditions.

7 Appendix A


To facilitate the proof of Theorem 1, we first present the following two lemmas. The proof of
Lemma 2 is similar to the proof given in Theorem 1 of Banjevic, Ishwaran and Zarepour (2002).

Lemma 2. Let (Zi fc, be a sequence of symmetric i.i.d. random vectors on Z 2 = ^ 2 )^ : 

zf + Z 2 = 1} with a probability distribution Q on the boundary of the unit circle, and 






-1/a 


Then the characteristic function for X is 0(s) = exp [—iTE (|siZi_i + S 2 -^ 2 ,i|“)]. where s = 
(si, S 2 ), and K is given by 


r cos(7ra/2)r(l — q) if 0 < a < 1, 

K = < 7r(2 — a)/{2a) if a = 1, 

cos(7ra/2)r(3 — a)/(a^ — a) ifl<a<2. 

where $\{\Gamma_k\}$ is defined in (2.4) and is independent from $\{Z_{i,k}\}$, i = 1, 2.

Lemma 3. Let $T_n(\cdot)$ be defined by (2.5), then

$$T_n(\cdot) \xrightarrow{d} T(\cdot),$$

where $T(\cdot) = (T_1(\cdot), T_2(\cdot))^T$ is a bivariate stable process with index $\alpha$.


Proof. Consider the point process convergence in ( |5.1| ). By applying the continuous mapping 
theorem, for any 6 G (0, 27r), we have 


n 

^ ^ / n) / n) ,aU ^k) 

k=l 



Similar to Resnick’s (1986) Proposition 3.4, it can be shown that 


[nt] oo 

a~^'^{cos{9k/n) ,sm{ek/n))ek ^ (cos (6*C/fc), sin (^t/fc)) < f). (7.1) 

k=l fc=l 

Notice that we can easily find the finite dimensional distribution for the limiting process in (7.1).
For instance, for t = 1 and by applying Lemma 2 where $(Z_{1,k}, Z_{2,k}) = (\cos(\theta U_k)\delta_k, \sin(\theta U_k)\delta_k)$,
and p = q = 1/2, we have


cj){s) = exp 



I Si cos{0u) + S 2 sin(0u)|“du 


Proof of Theorem 1 with Roots equal to 1. First, we consider the unit root model 


□ 


(1 - Bfiuti-) = U- 
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To avoid singular limiting distributions, similar to Chan and Wei (1988), we define 

utU) = (1 - for j = 1, 2,..., r, (7.2) 

or equivalently ut{l) = Yl]=i ^*(7) = Ylk=i — 1) for j = 2,..., r. From definition 

we have 

an^W[nd(l) = 5'n(f) =: Sl,n(f) S{t)=:§i{t). 

Since ut{j) = '^kU ~ 1) for J = 2,... , r, the continuous mapping theorem implies that 

= / §'j-l,n{s)ds =: §j^n{t) §j{t) (7.3) 

Jo 


Jn = NY^C, 


0 0 

-1 0 

V1 (-i)r7') 

and the appropriate normalizing constant is Nn = diag ..., 

First, notice that C (X]"=r+i ut-iUt-i^V’^(ei)) = H, where the {i,jY^ entry of 11 is 

n 

TTjj = ^ ut -1 {r - {i - 1)) ut -1 {r - {j - 1)) Y'(et) for i,j = 1,2, ... ,r. 


(7.4) 

' ) 

(-i)’-i / 


for j = 2,..., r. We also define 


where 


/! 


C = 


The joinf behavior of If can be sfudied through some tedious calculations. Therefore, for simplicity, 
we only calculate the limiting behavior of each term of If individually by using the following steps. 
As discussed before, each Y'iu) can be replaced by E {ip'iu)) when n —)■ oo. To see this, note that 

n n 

^ Ut-i{i)ut-i{j)Y'Yt) = X! [V^'Cet) - E(V''(et))+E(V''(et))] 

t—r+1 i^r+1 

n n 

= ut-iii)ut-i{j) [Y'Yt) - E {YYt))] + E iYYi)) Y 

(7.5) 


is 

Op(a^n(*+-^“^/^)). Then we have 


for 


hJ 


= 1 , 2 , 


,r. 


Notice that by (|2.5|) and (7.31, the first term on the right side of (7.5 


n 

X] {u) -{Y'{u))] A 0. 

t=r+\ 


(7.6) 
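From (7.3), along with applying the continuous mapping theorem and Proposition 2 of Paulauskas and Rachev (1998), we can find the limit of each term of the matrix $\Pi$, where

$$\frac{1}{a_n^{2} n^{i+j-1}} \sum_{t=r+1}^{n} u_{t-1}(i)\,u_{t-1}(j)\,\psi'(\varepsilon_t) \Rightarrow E(\psi'(\varepsilon_1)) \int_0^1 \mathbb{S}_i(t)\,\mathbb{S}_j(t)\,dt \quad \text{for } i,j = 1, 2, \ldots, r.$$

Therefore,

$$J_n \left( \sum_{t=r+1}^{n} \mathbf{u}_{t-1}\mathbf{u}_{t-1}^{T}\psi'(\varepsilon_t) \right) J_n^{T} \Rightarrow \Gamma,$$

where $\Gamma = (\gamma_{ij})$ is an $r \times r$ random matrix such that

$$\gamma_{ij} = E(\psi'(\varepsilon_1)) \int_0^1 \mathbb{S}_{r-i+1}(t)\,\mathbb{S}_{r-j+1}(t)\,dt \quad \text{for } i,j = 1, 2, \ldots, r. \tag{7.7}$$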


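Moreover, note that

$$J_n \sum_{t=r+1}^{n} \mathbf{u}_{t-1}\psi(\varepsilon_t) \Rightarrow \xi,$$

where

$$\xi = E^{1/2}\left(\psi^{2}(\varepsilon_1)\right) \left( \int_0^1 \mathbb{S}_r(t)\,dW(t), \int_0^1 \mathbb{S}_{r-1}(t)\,dW(t), \ldots, \int_0^1 \mathbb{S}_1(t)\,dW(t) \right)^{T}. \tag{7.8}$$

Now, summarize the results to obtain the following limiting distribution:

$$\left(J_n^{T}\right)^{-1} \left( \sum_{t=r+1}^{n} \mathbf{u}_{t-1}\mathbf{u}_{t-1}^{T}\psi'(\varepsilon_t) \right)^{-1} \sum_{t=r+1}^{n} \mathbf{u}_{t-1}\psi(\varepsilon_t) \Rightarrow \Gamma^{-1}\xi, \tag{7.9}$$

where $\Gamma = (\gamma_{ij})_{r \times r}$ and $\xi$, respectively, are defined in (7.7) and (7.8). □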
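The passage from the two separate limits to (7.9) rests on a purely algebraic identity: for invertible $J$ and $A$, $(J^{T})^{-1} A^{-1} b = (J A J^{T})^{-1} J b$. The sketch below checks this numerically. The matrices are arbitrary stand-ins (a random positive definite `A`, a random `b`, and a `J` built from made-up normalizing constants), not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
r = 3

X = rng.standard_normal((20, r))
A = X.T @ X                         # stand-in for sum u_{t-1} u_{t-1}^T psi'(eps_t)
b = rng.standard_normal(r)          # stand-in for sum u_{t-1} psi(eps_t)

# Hypothetical J = N^{-1} C with an invertible lower-triangular C.
C = np.array([[1.0, 0.0, 0.0],
              [1.0, -1.0, 0.0],
              [1.0, -2.0, 1.0]])
N = np.diag([8.0, 4.0, 2.0])
J = np.linalg.solve(N, C)

lhs = np.linalg.solve(J.T, np.linalg.solve(A, b))   # (J^T)^{-1} A^{-1} b
rhs = np.linalg.solve(J @ A @ J.T, J @ b)           # (J A J^T)^{-1} (J b)
identity_holds = np.allclose(lhs, rhs)
```

Because the identity is exact, the normalized estimator inherits its limit directly from the joint limit of $J_n(\cdot)J_n^{T}$ and $J_n \sum \mathbf{u}_{t-1}\psi(\varepsilon_t)$ by the continuous mapping theorem.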

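Proof of Theorem 1 with Roots equal to -1. Consider the following model

$$(1 + B)^{s} v_t = \varepsilon_t.$$

The limiting distribution in this case is similar to the case in which the time series has unit root 1, except that $\varepsilon_t$ is replaced by $(-1)^{t}\varepsilon_t$ for $t = 1, 2, \ldots, n$. Similar to Chan and Wei (1988), we define

$$v_t(j) = (1 + B)^{s-j} v_t \quad \text{for } j = 1, 2, \ldots, s. \tag{7.10}$$

Notice that (7.10) implies that $(-1)^{t} v_t(j+1) = \sum_{k=1}^{t} (-1)^{k} v_k(j)$. By (2.5) and (??), we have

$$\cdots$$

which implies that

$$\cdots \tag{7.11}$$

for $j = 2, \ldots, s$. Then, let

$$K_n = H_n^{-1} C,$$

where

$$C = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \cdots & \cdots & 1 \end{pmatrix}$$

and the appropriate normalizing constant is $H_n = \mathrm{diag}\left(n^{s-1/2} a_n, n^{s-3/2} a_n, \ldots, n^{1/2} a_n\right)$. Therefore, similar to the case with root 1, we have

$$K_n \left( \sum_{t=s+1}^{n} \mathbf{v}_{t-1}\mathbf{v}_{t-1}^{T}\psi'(\varepsilon_t) \right) K_n^{T} \Rightarrow \Upsilon,$$

where $\Upsilon = (\upsilon_{ij})$ is an $s \times s$ random matrix whose entries are as follows:

$$\upsilon_{ij} = \cdots \quad \text{for } i,j = 1, 2, \ldots, s. \tag{7.12}$$

We also have

$$K_n \sum_{t=s+1}^{n} \mathbf{v}_{t-1}\psi(\varepsilon_t) \Rightarrow \mathcal{K},$$

where

$$\mathcal{K} = -E^{1/2}\left(\psi^{2}(\varepsilon_1)\right) \left( \int_0^1 \cdots\,dW(t), \ldots, \int_0^1 \cdots\,dW(t) \right)^{T}. \tag{7.13}$$

Finally, we get

$$\left(K_n^{T}\right)^{-1} \left( \sum_{t=s+1}^{n} \mathbf{v}_{t-1}\mathbf{v}_{t-1}^{T}\psi'(\varepsilon_t) \right)^{-1} \sum_{t=s+1}^{n} \mathbf{v}_{t-1}\psi(\varepsilon_t) \Rightarrow \Upsilon^{-1}\mathcal{K}, \tag{7.14}$$

where $\Upsilon = (\upsilon_{ij})_{s \times s}$ and $\mathcal{K}$ are defined in (7.12) and (7.13), respectively. □

Proof of Theorem 1 with Complex Conjugate Unit Roots. Consider the model

$$(1 - 2\cos\theta\,B + B^{2})^{d} w_t = \varepsilon_t.$$

Similar to Chan and Wei (1988), let $y_t(j) = (1 - 2\cos\theta\,B + B^{2})^{d-j} w_t$ for $j = 1, 2, \ldots, d$ and $Y_t = (y_t(1), y_{t-1}(1), \ldots, y_t(d), y_{t-1}(d))^{T}$. Therefore, we have $(1 - 2\cos\theta\,B + B^{2})\,y_t(j+1) = y_t(j)$, and

$$y_t(j+1) = \frac{1}{\sin\theta} \sum_{k=1}^{t} \sin(\theta(t-k+1))\,y_k(j) \quad \text{for } j = 0, 1, \ldots, d-1.$$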
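Both summation identities used in this part of the proof, the one following (7.10) for the root $-1$ and the sine-weighted sum above for the complex roots, can be verified by direct computation. The sketch below is illustrative only: the function names, zero initial conditions, Gaussian input, and the value $\theta = 0.7$ are assumptions made for the check.

```python
import numpy as np

def solve_root_minus_one(x):
    """Solve (1 + B) v_t = x_t with v_0 = 0, i.e. v_t = x_t - v_{t-1}."""
    v = np.zeros(len(x))
    prev = 0.0
    for t, xt in enumerate(x):
        prev = xt - prev
        v[t] = prev
    return v

def solve_complex_roots(x, theta):
    """Solve (1 - 2cos(theta)B + B^2) y_t = x_t with y_0 = y_{-1} = 0."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        prev1 = y[t - 1] if t >= 1 else 0.0
        prev2 = y[t - 2] if t >= 2 else 0.0
        y[t] = x[t] + 2.0 * np.cos(theta) * prev1 - prev2
    return y

def sine_weighted_sum(x, theta):
    """Compute (1/sin theta) * sum_{k=1}^t sin(theta (t-k+1)) x_k for t = 1..n."""
    n = len(x)
    out = np.zeros(n)
    for t in range(1, n + 1):
        k = np.arange(1, t + 1)
        out[t - 1] = np.sum(np.sin(theta * (t - k + 1)) * x[:t]) / np.sin(theta)
    return out

rng = np.random.default_rng(2)
n, theta = 60, 0.7
x = rng.standard_normal(n)
signs = (-1.0) ** np.arange(1, n + 1)

# Root -1: (-1)^t v_t should equal sum_{k<=t} (-1)^k x_k.
v = solve_root_minus_one(x)
gap_minus_one = np.max(np.abs(signs * v - np.cumsum(signs * x)))

# Complex roots: the recursion and the sine-weighted sum should agree.
y = solve_complex_roots(x, theta)
gap_complex = np.max(np.abs(y - sine_weighted_sum(x, theta)))
```

Both gaps should be at the level of floating-point rounding, reflecting that $\sin((m+1)\theta)/\sin\theta$ is the impulse response of $(1 - 2\cos\theta\,B + B^{2})^{-1}$.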
We can find a $2d \times 2d$ matrix $D$ such that $D\mathbf{w}_t = Y_t$, where $\mathbf{w}_t^{T} = (w_t, \ldots, w_{t-2d+1})$. For more details see Chan and Wei (1988). By applying trigonometric identities, we have

$$\sin(\theta)\,y_t(j) = a_n \sin((t+1)\theta)\,T_{1,t}(j-1) - a_n \cos((t+1)\theta)\,T_{2,t}(j-1), \tag{7.15}$$

where $T_{1,t}(j) = a_n^{-1}\sum_{k=1}^{t} \cos(k\theta)\,y_k(j)$ and $T_{2,t}(j) = a_n^{-1}\sum_{k=1}^{t} \sin(k\theta)\,y_k(j)$. Note that by Lemma ??, we have

$$\left(T_{1,[nt]}, T_{2,[nt]}\right)^{T} \Rightarrow T(\cdot) = \left(T_1(t), T_2(t)\right)^{T}, \tag{7.16}$$

where $T(\cdot)$ is defined in Lemma ??. Now, consider the following representation:

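$$D \left( \sum_{t=2d+1}^{n} \mathbf{w}_{t}\mathbf{w}_{t}^{T}\psi'(\varepsilon_t) \right) D^{T} = \begin{pmatrix} \sum_{t=2d+1}^{n} y_t^{2}(1)\,\psi'(\varepsilon_t) & \cdots & \sum_{t=2d+1}^{n} y_t(1)\,y_{t-1}(d)\,\psi'(\varepsilon_t) \\ \vdots & \ddots & \vdots \\ \sum_{t=2d+1}^{n} y_{t-1}(d)\,y_t(1)\,\psi'(\varepsilon_t) & \cdots & \sum_{t=2d+1}^{n} y_{t-1}^{2}(d)\,\psi'(\varepsilon_t) \end{pmatrix}.$$

To find the limiting distribution of $D \left( \sum_{t=2d+1}^{n} \mathbf{w}_{t}\mathbf{w}_{t}^{T}\psi'(\varepsilon_t) \right) D^{T}$, by using Lemma 3.3.6 of Chan and Wei (1988) and from Proposition 8 of Jeganathan (1991), we have

$$\sup_{0<j<n} \left| \sum_{k=1}^{j} \cdots \right| = O_p(n) \tag{7.17}$$

for $i = 1, \ldots, d$. Moreover, by letting

$$L_n = M_n^{-1} D, \tag{7.18}$$

where $M_n = \mathrm{diag}\left(n^{1/2} a_n I, \ldots, n^{d-1/2} a_n I\right)$ and $I = \mathrm{diag}(1,1)$, we have

$$L_n \left( \sum_{t=2d+1}^{n} \mathbf{w}_{t-1}\mathbf{w}_{t-1}^{T}\psi'(\varepsilon_t) \right) L_n^{T} \Rightarrow \Lambda.$$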

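Here $\Lambda = (\lambda_{ij})$ is a $2d \times 2d$ random matrix, where $\lambda_{2i-1,2j-1} = \lambda_{2i,2j}$ and $\lambda_{2i-1,2j} = \lambda_{2i,2j-1}$ for $i,j = 1, 2, \ldots, d$. By using (7.16) and (7.17) along with the continuous mapping theorem, we can show that $\lambda_{ij}$ is given as follows:

$$\begin{aligned}
\lambda_{2i-1,2j-1} = \lambda_{2i,2j} &= E(\psi'(\varepsilon_1)) \left( \int_0^1 T_{1,t}(i-1)\,T_{1,t}(j-1)\,dt + \int_0^1 T_{2,t}(i-1)\,T_{2,t}(j-1)\,dt \right), \\
\lambda_{2i-1,2j} = \lambda_{2i,2j-1} &= E(\psi'(\varepsilon_1)) \Bigg[ \cos\theta \left( \int_0^1 T_{1,t}(i-1)\,T_{1,t}(j-1)\,dt + \int_0^1 T_{2,t}(i-1)\,T_{2,t}(j-1)\,dt \right) \\
&\qquad - \sin\theta \left( \int_0^1 T_{1,t}(i-1)\,T_{2,t}(j-1)\,dt - \int_0^1 T_{1,t}(j-1)\,T_{2,t}(i-1)\,dt \right) \Bigg].
\end{aligned} \tag{7.19}$$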
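The $\cos\theta$ and $\sin\theta$ weights in (7.19) come from averaging products of the rapidly oscillating factors in (7.15) against the slowly varying sums $T_{1,t}$ and $T_{2,t}$. The product-to-sum identities below are a sketch of that step; they track only the trigonometric weights, not the overall constants, which depend on normalization conventions not reproduced here.

```latex
% Product-to-sum identities behind the weights in (7.19); \theta \in (0,\pi).
\begin{align*}
\sin^2((t+1)\theta) &= \tfrac12\big[1 - \cos(2(t+1)\theta)\big], &
\cos^2((t+1)\theta) &= \tfrac12\big[1 + \cos(2(t+1)\theta)\big], \\
\sin((t+1)\theta)\sin(t\theta) &= \tfrac12\big[\cos\theta - \cos((2t+1)\theta)\big], &
\cos((t+1)\theta)\cos(t\theta) &= \tfrac12\big[\cos\theta + \cos((2t+1)\theta)\big], \\
\sin((t+1)\theta)\cos(t\theta) &= \tfrac12\big[\sin((2t+1)\theta) + \sin\theta\big], &
\cos((t+1)\theta)\sin(t\theta) &= \tfrac12\big[\sin((2t+1)\theta) - \sin\theta\big].
\end{align*}
% Terms with frequency 2\theta average out over t, leaving, up to a common factor,
%   y_t(i)\,y_t(j)     \;\leadsto\; T_1(i-1)T_1(j-1) + T_2(i-1)T_2(j-1),
%   y_t(i)\,y_{t-1}(j) \;\leadsto\; \cos\theta\,[T_1(i-1)T_1(j-1) + T_2(i-1)T_2(j-1)]
%                                   - \sin\theta\,[T_1(i-1)T_2(j-1) - T_1(j-1)T_2(i-1)].
```

The first pattern corresponds to the entries $\lambda_{2i-1,2j-1}$ and the second to $\lambda_{2i-1,2j}$, matching the bracket structure displayed in (7.19).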


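Also, we have

$$L_n \sum_{t=2d+1}^{n} \mathbf{w}_{t-1}\psi(\varepsilon_t) \Rightarrow S,$$

where $S = (S_1, S_2, \ldots, S_{2d})^{T}$. Note that $S_i$ for $i = 1, \ldots, 2d$ can be expressed as follows:

$$\begin{aligned}
S_{2j-1} &= E^{1/2}\left(\psi^{2}(\varepsilon_1)\right) \times \Bigg[ \cos\theta \left( \int_0^1 T_{1,t}(j-1)\,dR_1(t) - \int_0^1 T_{2,t}(j-1)\,dR_2(t) \right) \\
&\qquad + \sin\theta \left( \int_0^1 T_{1,t}(j-1)\,dR_2(t) + \int_0^1 T_{2,t}(j-1)\,dR_1(t) \right) \Bigg], \\
S_{2j} &= E^{1/2}\left(\psi^{2}(\varepsilon_1)\right) \times \left( \int_0^1 T_{1,t}(j-1)\,dR_1(t) - \int_0^1 T_{2,t}(j-1)\,dR_2(t) \right),
\end{aligned} \tag{7.20}$$

where $R = (R_1(\cdot), R_2(\cdot))^{T}$ is defined in (2.7). Thus, we have the following result:

$$\left(L_n^{T}\right)^{-1} \left( \sum_{t=2d+1}^{n} \mathbf{w}_{t-1}\mathbf{w}_{t-1}^{T}\psi'(\varepsilon_t) \right)^{-1} \sum_{t=2d+1}^{n} \mathbf{w}_{t-1}\psi(\varepsilon_t) \Rightarrow \Lambda^{-1} S, \tag{7.21}$$

where $\Lambda = (\lambda_{ij})$ and $S = (S_1, S_2, \ldots, S_{2d})^{T}$ are defined in (7.19) and (7.20), respectively. □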

Proof of Theorem 1 with cross product terms. Chan and Zhang (2012) show that the terms involving cross products converge to zero in probability. We skip the proof of this case, as it is similar to theirs.


□ 








